The Key Approach to Translation: Word Alignment Models

Author

  • Timothy Liu
Abstract

This paper focuses on a key aspect of Statistical Machine Translation (SMT): word alignment. Several word alignment models are presented, first distinguishing between the methods and then highlighting the preferred one. Each model is given a partially detailed mathematical explanation, and the later models are accompanied by a brief implementation of the Expectation Maximization (EM) algorithm. Statistical and error analysis follows each group of models. The purpose of this paper is to present an integral subproblem that Statistical Machine Translation must deal with and to show how some computational linguists and computer scientists approach it.

General Terms: EM Algorithm, Statistical Machine Translation (SMT)
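The abstract does not say which alignment models the paper covers, so the following is only a minimal sketch of how EM is commonly applied to word alignment, assuming the simplest lexical model (in the style of IBM Model 1). The function name `train_ibm_model1`, its parameters, and the toy corpus are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate lexical translation probabilities t(f | e) with EM.

    pairs: list of (source_tokens, target_tokens) sentence pairs.
    Returns a dict-like mapping (f, e) -> probability.
    Assumption: an IBM-Model-1-style model without a NULL word.
    """
    src_vocab = {f for fs, _ in pairs for f in fs}
    uniform = 1.0 / len(src_vocab)
    # Start from a uniform distribution over source words.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected number of (f, e) links
        total = defaultdict(float)  # expected number of links involving e

        # E-step: distribute each source word over the target words
        # of its sentence, in proportion to the current t(f | e).
        for fs, es in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac

        # M-step: re-estimate t(f | e) from the expected counts.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


# Hypothetical toy corpus, used only to exercise the function.
toy_pairs = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
t = train_ibm_model1(toy_pairs, iterations=10)
```

On this toy corpus, `t[("das", "the")]` ends up larger than `t[("haus", "the")]`, because "das" co-occurs with "the" in two sentence pairs while "haus" does so in only one. Richer models add distortion or fertility parameters on top of this same E-step/M-step loop.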

Similar resources

Improving Domain-Specific Word Alignment for Computer Assisted Translation

This paper proposes an approach to improve word alignment in a specific domain, in which only a small-scale domain-specific corpus is available, by adapting the word alignment information in the general domain to the specific domain. This approach first trains two statistical word alignment models with the large-scale corpus in the general domain and the small-scale corpus in the specific domai...

Statistical machine translation: from single word models to alignment templates

In this work, new approaches for machine translation using statistical methods are described. In addition to the standard source-channel approach to statistical machine translation, a more general approach based on the maximum entropy principle is presented. Various methods for computing single-word alignments using statistical or heuristic models are described. Various smoothing techniques, me...

Association-Based Bilingual Word Alignment

Bilingual word alignment forms the foundation of current work on statistical machine translation. Standard word-alignment methods involve the use of probabilistic generative models that are complex to implement and slow to train. In this paper we show that it is possible to approach the alignment accuracy of the standard models using algorithms that are much faster, and in some ways simpler, bas...

A Discriminative Framework for Bilingual Word Alignment

Bilingual word alignment forms the foundation of most approaches to statistical machine translation. Current word alignment methods are predominantly based on generative models. In this paper, we demonstrate a discriminative approach to training simple word alignment models that are comparable in accuracy to the more complex generative models normally used. These models have the advantages ...

Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond

Bilingual word alignment forms the foundation of current work on statistical machine translation. Standard word-alignment methods involve the use of probabilistic generative models that are complex to implement and slow to train. In this paper we show that it is possible to approach the alignment accuracy of the standard models using algorithms that are much faster...

Publication date: 2008